20 research outputs found

    Content-based access to spoken audio

    The amount of archived audio material in digital form is increasing rapidly, as advantage is taken of the growth in available storage and processing power. Computational resources are becoming less of a bottleneck to digitally recording and archiving vast amounts of spoken material, both television and radio broadcasts and individual conversations. However, listening to this ever-growing amount of spoken audio sequentially is too slow, and the bottleneck will become the development of effective ways to access content in these voluminous archives. The provision of accurate and efficient computer-mediated content access is a challenging task, because spoken audio combines information from multiple levels (phonetic, acoustic, syntactic, semantic and discourse). Most systems that assist humans in accessing spoken audio content have approached the problem by performing automatic speech recognition, followed by text-based information access. These systems have addressed diverse tasks including indexing and retrieving voicemail messages, searching for broadcast news, and extracting information from recordings of meetings and lectures. Spoken audio content is far richer than what a simple textual transcription can capture, as it has additional cues that disclose the intended meaning and the speaker’s emotional state. However, the text transcription alone still provides a great deal of useful information in applications. This article describes approaches to content-based access to spoken audio with a qualitative and tutorial emphasis. We describe how the analysis, retrieval and delivery phases contribute to making spoken audio content more accessible, and we outline a number of outstanding research issues. We also discuss the main application domains and try to identify important issues for future developments. The structure of the article is based on a general system architecture for content-based access, which is depicted in Figure 1.
Although the tasks within each processing stage may appear unconnected, the interdependencies and the sequence with which they take place vary

    Transcription and Summarization of Voicemail Speech

    This paper describes the development of a system to transcribe and summarize voicemail messages. The results of the research we present are two-fold. First, a hybrid connectionist approach to the Voicemail transcription task shows that competitive performance can be achieved using a context-independent system with fewer parameters than those based on mixtures of Gaussian likelihoods. Second, an effective and robust combination of statistical and prior knowledge sources for term weighting is used to extract information from the decoder’s output in order to deliver summaries to the message recipients via a GSM Short Message Service (SMS) gateway.

    Evaluation of extractive voicemail summarization.

    This paper is about the evaluation of a system that generates short text summaries of voicemail messages, suitable for transmission as text messages. Our approach to summarization is based on a speech-recognized transcript of the voicemail message, from which a set of summary words is extracted. The system uses a classifier to identify the summary words, with each word being identified by a vector of lexical and prosodic features. The features are selected using Parcel, an ROC-based algorithm. Our evaluations of the system, using a slot error rate metric, have compared manual and automatic summarization, and manual and automatic recognition (using two different recognizers). We also report on two subjective evaluations using mean opinion score of summaries, and a set of comprehension tests. The main result from these experiments was that the perceived difference in quality of summarization was affected more by errors resulting from automatic transcription than by the automatic summarization process.
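The slot error rate mentioned in this abstract can be illustrated with a small sketch. It assumes the commonly used definition, in which substitutions, deletions and insertions are summed and divided by the number of slots in the reference summary; the abstract does not spell out the exact formula the authors used, and the function name and arguments below are hypothetical.

```python
def slot_error_rate(correct, substitutions, deletions, insertions):
    """Slot error rate under the common definition (an assumption here):
    (S + D + I) / N, where N = C + S + D is the number of reference slots."""
    reference_slots = correct + substitutions + deletions
    if reference_slots == 0:
        raise ValueError("the reference must contain at least one slot")
    return (substitutions + deletions + insertions) / reference_slots


# Hypothetical counts: 6 correct, 2 substituted, 2 deleted, 1 inserted slot.
print(slot_error_rate(6, 2, 2, 1))
```

Because insertions are counted in the numerator but not in the denominator, the value can exceed 1 when a system outputs many spurious summary words.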

    The role of prosody in a voicemail summarization system

    When a speaker leaves a voicemail message there are prosodic cues that emphasize the important points in the message, in addition to lexical content. In this paper we compare and visualize the relative contribution of these two types of features within a voicemail summarization system. We describe the system's ability to generate summaries of two test sets, having trained and validated using 700 messages from the IBM Voicemail corpus. Results measuring the quality of summary artifacts show that combined lexical and prosodic features are at least as robust as combined lexical features alone across all operating conditions

    Extractive Summarization of Voicemail using Lexical and Prosodic Feature Subset Selection

    This paper presents a novel data-driven approach to summarizing spoken audio transcripts utilizing lexical and prosodic features. The former are obtained from a speech recognizer and the latter are extracted automatically from speech waveforms. We employ a feature subset selection algorithm, based on ROC curves, which examines different combinations of features at different target operating conditions. The approach is evaluated on the IBM Voicemail corpus, demonstrating that it is possible and desirable to avoid complete commitment to a single best classifier or feature set
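One way to picture why committing to a single best classifier may be unnecessary is the convex hull of classifier operating points in ROC space: different classifiers or feature subsets can be best at different operating conditions. The sketch below is not the authors' selection algorithm; it only computes the upper convex hull of a set of hypothetical (false positive rate, true positive rate) points, under the assumption that the trivial classifiers (0, 0) and (1, 1) are always available.

```python
def _cross(o, a, b):
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_convex_hull(points):
    """Return the upper convex hull of ROC points, sorted by false positive rate.

    `points` is an iterable of (fpr, tpr) pairs, one per classifier or
    feature subset. The trivial points (0, 0) and (1, 1) are added.
    """
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull = []
    for p in pts:
        # Drop the last hull point while it lies on or below the line
        # joining its predecessor to the new point.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


# Hypothetical operating points for three feature subsets.
print(roc_convex_hull([(0.1, 0.6), (0.3, 0.5), (0.4, 0.9)]))
```

In this example the point (0.3, 0.5) falls below the hull, so that feature subset would not be preferred at any operating condition, while the other two each remain useful in a different region.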

    Automatic summarization of voicemail messages using lexical and prosodic features

    This article presents trainable methods for extracting principal content words from voicemail messages. The short text summaries generated are suitable for mobile messaging applications. The system uses a set of classifiers to identify the summary words with each word described by a vector of lexical and prosodic features. We use an ROC-based algorithm, Parcel, to select input features (and classifiers). We have performed a series of objective and subjective evaluations using unseen data from two different speech recognition systems as well as human transcriptions of voicemail speech

    Fournier's gangrene in a patient after third-degree burns: a case report

    Introduction: Fournier's gangrene is characterized by tissue ischemia leading to rapidly progressing necrotizing fasciitis. Case presentation: We present the case of a patient with Fournier's gangrene after third-degree burns. Clinical manifestations, laboratory results and treatment options are discussed. Conclusion: Fournier's gangrene is a surgical emergency. Although it can be lethal, it is still a challenging situation in the field of surgical infections.

    Advances in Profile Assisted Voicemail Management

    Spoken audio is an important source of information available to knowledge extraction and management systems. Organization of spoken messages by priority and content can facilitate knowledge capture and decision making based on profiles of recipients as these can be determined by physical and social conditions. This paper revisits the above task and addresses a related data sparseness problem. We propose a methodology according to which the coverage of language models used to categorize message types is augmented with previously unobserved lexical information derived from other corpora. Such lexical information is the result of combining word classes constructed by an agglomerative clustering algorithm which follows a criterion of minimum loss in average mutual information. We subsequently generate more robust category estimators by interpolating class-based and voicemail word-based models.
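The interpolation of class-based and word-based models described in this abstract can be sketched as a simple linear mixture. This is only an illustration under stated assumptions: bigram models, a fixed interpolation weight, and a hard word-to-class mapping. The function name, parameters and toy probabilities are hypothetical and are not taken from the paper.

```python
def interpolated_prob(word, prev, word_bigram, class_bigram,
                      word_given_class, word_to_class, lam=0.5):
    """Linear interpolation of a word bigram and a class bigram model:

    P(w | v) = lam * P_word(w | v)
               + (1 - lam) * P_class(c(w) | c(v)) * P(w | c(w))
    """
    p_word = word_bigram.get((prev, word), 0.0)
    c_prev = word_to_class.get(prev)
    c_word = word_to_class.get(word)
    # The class model can give mass to word pairs never seen together,
    # as long as their classes have been seen together.
    p_class = (class_bigram.get((c_prev, c_word), 0.0)
               * word_given_class.get((word, c_word), 0.0))
    return lam * p_word + (1.0 - lam) * p_class


# Toy, hypothetical models: the bigram "call back" is unseen in the word model.
word_bigram = {("call", "me"): 0.2}
word_to_class = {"call": "VERB", "me": "PRON", "back": "ADV"}
class_bigram = {("VERB", "ADV"): 0.5}
word_given_class = {("back", "ADV"): 0.4}

print(interpolated_prob("back", "call", word_bigram, class_bigram,
                        word_given_class, word_to_class))
```

The unseen bigram still receives a non-zero probability through its classes, which is the sense in which a class-based component can ease the data sparseness problem the abstract refers to.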